Monitoring Multivariate Data via KNN Learning
Abstract
Process monitoring of multivariate quality attributes is important in many industrial applications, in which rich historical data are often available thanks to modern sensing technologies. While multivariate statistical process control (SPC) has been receiving increasing attention, existing methods are often inadequate because they either cannot deliver satisfactory detection performance or cannot cope with massive amounts of complex data. In this paper, we propose a novel k-nearest neighbors empirical cumulative sum (KNN-ECUSUM) control chart for monitoring multivariate data by utilizing historical data under in-control and out-of-control scenarios. Our proposed method uses the k-nearest neighbors (KNN) algorithm for dimension reduction to transform multi-attribute data into univariate data, and then applies the CUSUM procedure to monitor changes in the empirical distribution of the transformed univariate data. Simulation studies and a real industrial example based on a disk monitoring system demonstrate the effectiveness of our proposed method.
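The two-stage idea in the abstract (reduce each multivariate observation to a single number with KNN, then run a CUSUM on that number) can be illustrated with a minimal sketch. This is not the paper's KNN-ECUSUM statistic, which the abstract does not specify: the score used here (the fraction of the k nearest historical points that are labeled out-of-control), the plain one-sided CUSUM recursion, the function names `knn_score` and `cusum_alarm`, and the values of `k`, `reference`, and `threshold` are all assumptions chosen for illustration.

```python
import numpy as np

def knn_score(x, X_ic, X_oc, k=5):
    """Assumed score: fraction of the k nearest historical points
    (in-control and out-of-control pooled) that are out-of-control."""
    X = np.vstack([X_ic, X_oc])
    labels = np.concatenate([np.zeros(len(X_ic)), np.ones(len(X_oc))])
    dists = np.linalg.norm(X - x, axis=1)   # Euclidean distance to every historical point
    nearest = np.argsort(dists)[:k]         # indices of the k closest points
    return labels[nearest].mean()

def cusum_alarm(scores, reference, threshold):
    """Standard one-sided CUSUM on a univariate sequence.
    Returns the index of the first alarm, or None if no alarm is raised."""
    s = 0.0
    for t, z in enumerate(scores):
        s = max(0.0, s + z - reference)     # accumulate only upward drift
        if s > threshold:
            return t
    return None

# Synthetic illustration (made-up data, not from the paper):
rng = np.random.default_rng(0)
X_ic = rng.normal(0.0, 1.0, size=(200, 3))  # historical in-control sample
X_oc = rng.normal(2.0, 1.0, size=(200, 3))  # historical out-of-control sample
stream = np.vstack([
    rng.normal(0.0, 1.0, size=(50, 3)),     # first 50 observations in control
    rng.normal(2.0, 1.0, size=(50, 3)),     # mean shift from observation 50 onward
])
scores = [knn_score(x, X_ic, X_oc) for x in stream]
alarm = cusum_alarm(scores, reference=0.5, threshold=3.0)
print("first alarm at observation:", alarm)
```

In practice the reference value and threshold would be calibrated to a target in-control run length rather than set by hand as they are here.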
Related resources
Fast Approximate kNN Graph Construction for High Dimensional Data via Recursive Lanczos Bisection
Nearest neighbor graphs are widely used in data mining and machine learning. A brute-force method to compute the exact kNN graph takes Θ(dn^2) time for n data points in the d-dimensional Euclidean space. We propose two divide-and-conquer methods for computing an approximate kNN graph in Θ(dn^t) time for high-dimensional data (large d). The exponent t ∈ (1,2) is an increasing function of an intern...
A Study of kNN using ICU Multivariate Time Series Data
A time series is a sequence of data collected at successive time points. While most techniques for time series analysis have been focused on univariate time series data at fixed intervals, there are many applications where time series data are collected at irregular and uncertain time intervals across multiple input variables. The uncertainty in multivariate time series makes analysis difficult...
Rice Seed Cultivar Identification Using Near-Infrared Hyperspectral Imaging and Multivariate Data Analysis
A near-infrared (NIR) hyperspectral imaging system was developed in this study. NIR hyperspectral imaging combined with multivariate data analysis was applied to identify rice seed cultivars. Spectral data were extracted from hyperspectral images. Along with Partial Least Squares Discriminant Analysis (PLS-DA), Soft Independent Modeling of Class Analogy (SIMCA), K-Nearest Neighbor Algorithm (KNN) ...
ON SUPERVISED AND SEMI-SUPERVISED k-NEAREST NEIGHBOR ALGORITHMS
The k-nearest neighbor (kNN) is one of the simplest classification methods used in machine learning. Since the main component of kNN is a distance metric, kernelization of kNN is possible. In this paper kNN and semi-supervised kNN algorithms are empirically compared on two data sets (the USPS data set and a subset of the Reuters-21578 text categorization corpus). We use a soft version of the kN...
KNN Model-Based Approach in Classification
The k-Nearest-Neighbours (kNN) is a simple but effective method for classification. The major drawbacks of kNN are (1) its low efficiency, since being a lazy learning method rules it out in many applications such as dynamic web mining for a large repository, and (2) its dependency on the selection of a “good value” for k. In this paper, we propose a novel kNN type method for classificatio...